# GRPO Fine-tuning

GRPO VI Qwen2 7B RAG

A Vietnamese retrieval-augmented generation (RAG) model fine-tuned from Qwen2.5-7B-Instruct using the GRPO optimization method.

Large Language Model

Transformers Other

Xiyansql QwenCoder 7B 2504

A fine-tuned SQL generation model based on QwenCoder, supporting multiple dialects with excellent performance

Text Generation

Safetensors Supports Multiple Languages

Nano Aha Moment 3b

A 3-billion-parameter language model trained with reinforcement learning for solving mathematical reasoning tasks, especially countdown games.

Large Language Model

Gemma 3 4b Reasoning

Gemma-3-4b Reasoning is a Transformer-based language model fine-tuned with the GRPO method and optimized for reasoning tasks.

Large Language Model

Transformers English

Medqwen3b Reasoner

A medical-domain model based on Qwen2.5-3B-Instruct that excels at medical reasoning and mathematical problem-solving.

Large Language Model English
